ABSTRACT



A model is presented for the prediction of future
organization size based on the numbers of recruits entering the
organization in the past. This model utilizes the correla- 
tion between populations of successive time periods in 
order to better estimate the future remaining personnel in 
the system. The number of personnel leaving from each 
recruit cohort is assumed to follow the same probability 
distribution, which is a function of the age of the cohort 
in the organization and the grade in which the cohort 
started. For large cohort sizes the total personnel in the 
system is approximately normally distributed. This result 
justifies the use of a best linear prediction method which 
takes into account past errors of estimating the continuing 
population from one period to the next. Sensitivity of 
predictions to errors in probability estimates is discussed. 
The model is applied to predicting university student 
enrollment. Comparison of predicted and actual student 






library 

NAVAL POSTGRADUATE SCHOOL
MONTEREY, CALIF. 93940 

TABLE OF CONTENTS

I. INTRODUCTION

II. MODEL

III. MEETING SPECIFIED GOALS

IV. SENSITIVITY TO ERRORS IN PROBABILITY ESTIMATES

V. APPLICATION TO UNIVERSITY ENROLLMENT

APPENDIX A: REMAINDER TERMS TO PREDICTION ERROR

BIBLIOGRAPHY

INITIAL DISTRIBUTION LIST

FORM DD 1473



LIST OF TABLES

I. New Enrollments at the University of California, Berkeley

II. Fractions of Students Attending Each Successive Fall
    After Enrollment

III. Values Used in Prediction

IV. Predicted Total Fall Enrollment



I. INTRODUCTION 



In many large corporations and institutions with a high
rate of personnel turnover, a crucial problem in recruitment
planning is that of predicting, from one period to the next,
how many of the personnel presently in the organization will
remain. When the length of service of a single member is
fixed, or completely controlled by the organization, the
problem is trivial. However, when the length of service
is variable, as with middle management in large
corporations, the military services, or university student
bodies, probabilistic arguments must be used to estimate
expected attrition.

The theory of Markov chains has been widely used in
prediction models. A basic assumption in such models is
one of stationarity of the rates of movement within a system
of defined states. In order to assess the transition
probabilities between the various states of the system, it is
necessary to identify various characteristics of personnel
in the organization in the past in order to make predictions
for the future. A number of such models can be found in the
literature, including Bartholomew [1967] and Thonstad [1968].

Of particular interest is a paper by McAfee [1970],
which describes a different type of prediction model, the
so-called cohort model. For this model McAfee tests the
stationarity properties of the distribution of the remaining
fraction of an initial cohort size in subsequent periods
after entry into the organization. This was done for
three different cohorts which entered the system in three
adjacent periods. This stationarity property will be assumed
to hold in the prediction model of this paper. When the cohort
sizes vary from period to period, McAfee showed that the
Markov models alluded to above do not accurately describe
movement of personnel in the system. Since later in this
paper we treat new cohort sizes as control variables,
it is not meaningful to consider them constant in size over
time; hence we concentrate in this paper on the cohort
model and analyze some of its characteristics.

A basic assumption of this cohort model is that all 
members of a given cohort behave independently of each 
other, and each member's lifetime in the system is a real- 
ization from a common stochastic process. These assumptions 
lead to the binomial distribution for predicting the con- 
tinuing fraction of a cohort with a given age in the system. 

The cohort prediction model views the number of
personnel in the system as a superposition of continuing
portions of past cohorts. We assume that the behavior of
a member in one cohort is independent of, but from the same
distribution as, that of a member in a different cohort.
(By "different" is meant a different time of entry
into the system.) To estimate the continuing portion of
the present organization size for the next period, we sum
the continuing portions of the past cohorts, basing the
expected number from each cohort on its size at entry and
its age in the organization (viz., the sum of expected values
of independent random variables).

As time elapses in the system for a given cohort, the
number remaining in the organization from one period to
the next is dependent on the number remaining from the
previous period. There exists a positive correlation
between the remaining portion of a single cohort
in one period and the remaining portion of that same
cohort in the previous period. This correlation is
cumulative when we examine the correlation between the sum of
continuing portions in one period and the sum of the
continuing portions of the past period. It is this
correlation property of the model which will be of special
significance in improving the prediction characteristics
of the model.

Another property of the model which will be proved and
used to advantage is that with large cohort sizes, the
distribution of the sum of continuing cohort portions
asymptotically approaches a normal distribution. This
allows us to derive simple tractable formulae for predicting
the number in the system in a given time period, given the
number present in the previous time period, with no detailed
knowledge of the cohort from which the various members came.
Without this property, the exact expression for the expected
value of the organization size next period, given the
realization of the present size, is intractable. A best
linear predictor for the expected value of the organization
size next period is this conditional expectation when the
organization size is a normal random variable.

Mathematical expressions for the prediction of future
organization size, knowing the size of each past and present
cohort, are derived. A decomposition of the organization
into grades is then made to predict future grade sizes
within the organization. Finally, an application of the
model is made to the university student enrollment problem,
where the desired prediction is that of total enrollment
using data on past periods for new enrollments. This
application is made with data from the University of California,
Berkeley, during the period 1961 to 1969. For this model
there are 16 lifetime distributions which repeat yearly:
one each for freshman, sophomore, junior and senior new
students admitted into each of four academic quarters.
II. MODEL



Denote by $X_i(u)$ the number of persons who enter an
organization at time $u$ in grade $i$. Let $K$ be the number of
grades in the organization and $M$ be the number of time
periods (epochs) beyond which no member can remain in the
organization. Let the probability of a single member
remaining until at least $s$ epochs have elapsed (since entry
into the system) be $p_i(s)$, and let $X_i(u,s)$ be the number
who entered in grade $i$ at time $u$ who are still in the
organization at time $u+s$. Then $X_i(t-s,s)$ is the number in the
system at time $t$ of those that entered at time $t-s$ in grade
$i$, a binomially distributed random variable with parameters
$X_i(t-s)$ and $p_i(s)$. The expected value of $X_i(t-s,s)$ is
$p_i(s)X_i(t-s)$ and the variance is $X_i(t-s)p_i(s)q_i(s)$, where
$q_i(s)$ is $1-p_i(s)$. When the time elapsed since cohort entry,
$s$, reaches the value $M$, the expected value and the variance
of $X_i(t-M,M)$ are both zero. It is assumed that the behavior
of members of a cohort which entered the system at time $t$
is independent of those in a cohort which entered at time
$u$, $u \neq t$. It is also assumed that for $i \neq j$, the behavior of
the members of $X_i(t)$ is unaffected by that of members of
$X_j(t)$.

Denote by $Y_i(t)$ the number of persons present at time $t$
who started in grade $i$, and by $Y(t)$ the entire population
present in the organization at time $t$. The entire
organization size, $Y(t)$, at time $t$ is the sum of all
remaining portions of cohort sizes which started in each
of the $K$ grades, counting back $M$ epochs from the present
time. Thus we have that

$$Y(t) = \sum_{i=1}^{K} Y_i(t) = \sum_{i=1}^{K} \sum_{s=0}^{M} X_i(t-s,s).$$

The expected value and variance of $Y(t)$ are then,
respectively,

$$E(Y(t)) = \sum_{i=1}^{K} \sum_{s=0}^{M} X_i(t-s)\,p_i(s)$$

and

$$\mathrm{Var}(Y(t)) = \sum_{i=1}^{K} \sum_{s=0}^{M} X_i(t-s)\,p_i(s)\,q_i(s),$$

where $p_i(0) = 1$ and $q_i(0) = 0$.
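As an illustration of the two sums above, the following Python sketch computes the mean and variance of the total organization size. The function name, the list layout (`entries[i][s]` holding the cohort size that entered grade i at time t-s, `survival[i][s]` holding the survival probability for grade i at age s) and the numbers are assumptions made for the example; they are not data from this paper.

```python
def total_mean_and_variance(entries, survival):
    """Return (E[Y(t)], Var[Y(t)]) for the cohort model.

    entries[i][s]  : hypothetical cohort size X_i(t-s)
    survival[i][s] : hypothetical survival probability p_i(s), with p_i(0) = 1
    """
    mean = 0.0
    var = 0.0
    for x_i, p_i in zip(entries, survival):
        for x, p in zip(x_i, p_i):
            mean += x * p                # binomial mean of X_i(t-s, s)
            var += x * p * (1.0 - p)     # binomial variance of X_i(t-s, s)
    return mean, var


# Two grades, cohorts of age 0, 1 and 2 (illustrative numbers only).
entries = [[1000, 900, 800], [200, 250, 300]]
survival = [[1.0, 0.8, 0.5], [1.0, 0.6, 0.2]]
mean, var = total_mean_and_variance(entries, survival)
print(mean, var)
```

The newest cohorts contribute nothing to the variance because their survival probability at age zero is one.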

Our objective is to find expressions for the conditional
expectations $E[Y(t+1) \mid Y(t)]$ and $E[Y_i(t+1) \mid Y_i(t)]$, for
which we need the distributions of $Y(t)$ and $Y_i(t)$. Although
the first and second moments of $Y_i(t)$ and $Y(t)$ have simple
forms, explicit formulae for their distribution functions
are very unwieldy. However, if all initial cohort sizes
are large for all time $t$, then the distributions of $Y_i(t)$
and $Y(t)$ are asymptotically normal. This follows from
Theorem One and the fact that the $Y$'s are sums of
independent random variables.
Theorem One:

If $\{X_i, i \in I\}$ is a family of random variables, each
independent and each distributed binomially with parameters
$p_i$ and $N_i$ (for $0 < p_i < 1$ and $N_i > 0$), then for

$$W = \sum_{i \in I} (X_i - N_i p_i) \Big/ \Big(\sum_{i \in I} N_i p_i q_i\Big)^{1/2},$$

the distribution of $W$ asymptotically approaches that of a
random variable which is distributed as normal (0,1) as
$N_i \to \infty$ for all $i \in I$.

Proof: In order to prove the theorem, it is sufficient to
show that the moment generating function, $G_W(m)$,
approaches the limit $e^{m^2/2}$ as $N_i \to \infty$ for all $i \in I$.

1. Let $B = \big(\sum_{i \in I} N_i p_i q_i\big)^{1/2}$.

   a. The expansion of $e^x$ is $\sum_{j=0}^{\infty} x^j/j!$ and, for $x$ small,
      is $1 + x + x^2/2 + o(x^2)$.

   b. For $x$ small, the term $(1+x)^n$ can be represented
      by $e^{nx}$.

2. $$G_W(m) = E(e^{mW}) = E\Big[\exp\Big(m \sum_{i \in I} (X_i - N_i p_i)/B\Big)\Big]
   = \prod_{i \in I} E\big[e^{m(X_i - N_i p_i)/B}\big]$$
   $$= \prod_{i \in I} \big(p_i e^{m/B} + q_i\big)^{N_i} e^{-N_i p_i m/B}
   = \prod_{i \in I} \big(p_i e^{q_i m/B} + q_i e^{-p_i m/B}\big)^{N_i}.$$

3. From 1a, the expansion of $p_i e^{q_i m/B}$ is
   $p_i[1 + q_i m/B + \tfrac{1}{2}(q_i m/B)^2 + o(1/B^2)]$ and the expansion of
   $q_i e^{-p_i m/B}$ is $q_i[1 - p_i m/B + \tfrac{1}{2}(p_i m/B)^2 + o(1/B^2)]$. Thus,

   $$p_i e^{q_i m/B} + q_i e^{-p_i m/B}
   = (p_i + q_i) + (p_i q_i - p_i q_i)\,m/B
   + \tfrac{1}{2} p_i q_i (q_i + p_i)(m/B)^2 + o(1/B^2)$$
   $$= 1 + \tfrac{1}{2} p_i q_i (m/B)^2 + o(1/B^2).$$

4. From 1b, and using the fact that the term $\tfrac{1}{2} p_i q_i (m/B)^2$
   is small for large values of $B$, and neglecting $o(1/B^2)$,
   we have

   $$\big(p_i e^{q_i m/B} + q_i e^{-p_i m/B}\big)^{N_i} = e^{p_i q_i N_i (m/B)^2/2},$$

   and hence

   $$G_W(m) = \prod_{i \in I} e^{\frac{1}{2} p_i q_i N_i (m/B)^2}
   = e^{\sum_{i \in I} \frac{1}{2} p_i q_i N_i (m/B)^2} = e^{m^2/2}$$

   in the limit as $N_i$ tends to infinity for all $i$ in $I$,
   since $(1/B^2)\big(\sum_{i \in I} p_i q_i N_i\big) = 1$. □
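The limit in Theorem One can be checked numerically. The sketch below evaluates the exact moment generating function of W for a small family of independent binomials (the product form from step 2 of the proof) and compares it with exp(m^2/2) as all cohort sizes grow. The function names and the parameter values are assumptions chosen for illustration only.

```python
import math


def mgf_of_w(m, params):
    """Exact G_W(m) for independent binomials; params = [(N_i, p_i), ...]."""
    b = math.sqrt(sum(n * p * (1.0 - p) for n, p in params))
    log_g = 0.0
    for n, p in params:
        q = 1.0 - p
        # One factor of the product in step 2, accumulated in log space.
        log_g += n * math.log(p * math.exp(q * m / b) + q * math.exp(-p * m / b))
    return math.exp(log_g)


def gap(scale, m=1.0):
    """|G_W(m) - exp(m^2/2)| when every N_i is multiplied by `scale`."""
    params = [(10 * scale, 0.3), (20 * scale, 0.7), (5 * scale, 0.5)]
    return abs(mgf_of_w(m, params) - math.exp(m * m / 2.0))


for scale in (1, 10, 1000):
    print(scale, gap(scale))
```

The printed gap should shrink as the scale increases, which is the behavior the theorem describes.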

It is now possible to examine how best to estimate
expected future values of $Y$ and $Y_i$ when past realizations
of $Y$ and $Y_i$ are known. When the conditional expectation of
a random variable cannot be found explicitly, Parzen (1960)
suggests a best linear predictor which minimizes the mean
square error of prediction using a linear function of
previous realizations. When those random variables are
normal, this function gives the exact conditional expectation
of the random variable, given the values in the previous
time periods. Theorem One justifies our use of the best
linear predictor. In Theorem Two, we derive the general
expression for this predictor, which we later specialize to
our cohort model.



Theorem Two:

A best linear predictor is defined to be that function
$E^*(Y) = a + bX$, for $Y$ and $X$ random variables,
which minimizes $E[(Y - E^*[Y])^2]$, denoted $\mathrm{Var}^*(Y)$. The
expressions for $E^*(Y)$ and $\mathrm{Var}^*(Y)$ are

$$E^*(Y) = E(Y) + [\mathrm{Cov}(Y,X)/\mathrm{Var}(X)]\,[X - E(X)]$$

and

$$\mathrm{Var}^*(Y) = \mathrm{Var}(Y)\,(1 - \rho^2(X,Y)),$$

where the term $\rho(X,Y)$ denotes the correlation coefficient
of $X$ and $Y$.

Proof:

1. $\mathrm{Var}^*(Y) = E[(Y - (a + bX))^2]$.

2. $\frac{\partial}{\partial a}\mathrm{Var}^*(Y) = -2E[Y - (a + bX)] = -2E[Y] + 2(a + bE[X])$,
   which, when set to zero, yields $a = E[Y] - bE[X]$.

3. $\frac{\partial}{\partial b}\mathrm{Var}^*(Y) = -2E[XY - X(a + bX)] = -2E[XY] + 2aE[X] + 2bE[X^2]$,
   which, when set to zero and the substitution for $a$ is made,
   yields

   $$b = \frac{E[XY] - E[X]E[Y]}{E[X^2] - E^2[X]} = \mathrm{Cov}[X,Y]/\mathrm{Var}[X].$$

4. Substituting for $a$ and $b$ in the expressions for $E^*(Y)$
   and $\mathrm{Var}^*(Y)$,

   $$E^*(Y) = E[Y] - bE[X] + bX = E[Y] + (\mathrm{Cov}[X,Y]/\mathrm{Var}[X])\,(X - E[X])$$

   and

   $$\mathrm{Var}^*(Y) = E[(Y - E^*[Y])^2]
   = E[(Y - E[Y])^2] + b^2 E[(X - E[X])^2] - 2bE[(X - E[X])(Y - E[Y])]$$
   $$= \mathrm{Var}[Y] + b^2\,\mathrm{Var}[X] - 2b\,\mathrm{Cov}[X,Y]$$
   $$= \mathrm{Var}[Y] + [\mathrm{Cov}(X,Y)/\mathrm{Var}(X)]^2\,\mathrm{Var}[X]
   - 2(\mathrm{Cov}[X,Y]/\mathrm{Var}[X])\,\mathrm{Cov}[X,Y]$$
   $$= \mathrm{Var}[Y]\,[1 - \mathrm{Cov}^2[X,Y]/(\mathrm{Var}[X]\mathrm{Var}[Y])]$$
   $$= \mathrm{Var}[Y]\,(1 - \rho^2[X,Y]). \;\square$$
In a similar manner, $E^{**}(Y) = E(Y) + b_1(X - E(X)) + b_2(Z - E(Z))$
minimizes $\mathrm{Var}^{**}(Y) = E[(Y - E^{**}(Y))^2]$ when

$$b_1 = D\,[\mathrm{Cov}(X,Y)/\mathrm{Var}(X) - \mathrm{Cov}(Y,Z)\,\mathrm{Cov}(X,Z)/\{\mathrm{Var}(X)\mathrm{Var}(Z)\}],$$
$$b_2 = D\,[\mathrm{Cov}(Z,Y)/\mathrm{Var}(Z) - \mathrm{Cov}(Y,X)\,\mathrm{Cov}(X,Z)/\{\mathrm{Var}(X)\mathrm{Var}(Z)\}],$$

and

$$D = 1/(1 - \rho^2[X,Z]).$$

The expression for $\mathrm{Var}^{**}(Y)$ then becomes

$$\mathrm{Var}^{**}(Y) = \mathrm{Var}(Y) + b_1^2\,\mathrm{Var}(X) + b_2^2\,\mathrm{Var}(Z) + 2b_1 b_2\,\mathrm{Cov}(X,Z)
- 2b_1\,\mathrm{Cov}(Y,X) - 2b_2\,\mathrm{Cov}(Y,Z).$$
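The coefficient formulas of Theorem Two and its two-variable extension translate directly into code. The sketch below is a minimal illustration, with function names and input moments that are assumptions for the example; it also checks that the two-variable coefficients satisfy the normal equations of the least-squares problem.

```python
def blp_one(cov_yx, var_x):
    """Slope b of the one-variable best linear predictor E*(Y)."""
    return cov_yx / var_x


def blp_two(var_x, var_z, cov_xz, cov_yx, cov_yz):
    """Coefficients (b1, b2) of the two-variable predictor E**(Y)."""
    rho2 = cov_xz ** 2 / (var_x * var_z)     # squared correlation of X and Z
    d = 1.0 / (1.0 - rho2)
    b1 = d * (cov_yx / var_x - cov_yz * cov_xz / (var_x * var_z))
    b2 = d * (cov_yz / var_z - cov_yx * cov_xz / (var_x * var_z))
    return b1, b2


def residual_variance_two(var_y, var_x, var_z, cov_xz, cov_yx, cov_yz):
    """Var**(Y) evaluated at the coefficients returned by blp_two."""
    b1, b2 = blp_two(var_x, var_z, cov_xz, cov_yx, cov_yz)
    return (var_y + b1 ** 2 * var_x + b2 ** 2 * var_z
            + 2.0 * b1 * b2 * cov_xz - 2.0 * b1 * cov_yx - 2.0 * b2 * cov_yz)


# Illustrative second moments (not taken from the paper).
b1, b2 = blp_two(var_x=4.0, var_z=9.0, cov_xz=3.0, cov_yx=2.0, cov_yz=1.5)
print(b1, b2)
```

With these particular moments the second coefficient vanishes, so the two-variable predictor collapses to the one-variable predictor with slope `blp_one(2.0, 4.0)`.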

We now have an expression for the best linear predictor
for $Y(t)$ when we know the past realizations, $Y(t-1)$ and
$Y(t-2)$, regardless of the distribution of $Y(t)$. For large
cohort sizes at entry into the system, we have that this
predictor is best (i.e., it is $E[Y(t) \mid Y(t-1), Y(t-2)]$),
since the distribution of $Y(t)$ will be very close to normal.
Hence the functions $E^*$ and $E^{**}$ are actually conditional
expectations given the past period error and the last two
period errors respectively.

The only terms used in expressing $E^*$ and $E^{**}$ which have
not been derived are $\mathrm{Cov}[Y(t),Y(t-1)]$ and $\mathrm{Cov}[Y(t),Y(t-2)]$.
Since independence exists between $X_i(t-y,y)$ and $X_j(t-y,y)$
for $i \neq j$, and independence exists between $X_i(t-s,s)$ and
$X_i(t-u,u)$ for $u \neq s$, the covariance between all such $X$ terms
is zero except for that between $X_i(t-s,u)$ and $X_i(t-s,w)$.
Here, for $w > u$, we may interpret $X_i(t-s,u)$ to be the remnants
of the initial size at entry, $X_i(t-s)$; and $X_i(t-s,w)$ to be
the remnants, some $w-u$ epochs later, of $X_i(t-s,u)$. Given
that a member of $X_i(t-s)$ remains until a time $u$ epochs
later, the probability of his remaining until $w$ epochs after
entry into the system is $p_i(w)/p_i(u)$. (Assuming a person
who leaves the system does not return, $0 \le p_i(w)/p_i(u) \le 1$
for $w > u$.) The conditional expectation of $X_i(t-s,w)$ given
the realization of $X_i(t-s,u)$ is then $[p_i(w)/p_i(u)]\,X_i(t-s,u)$.
Hence the covariance of $X_i(t-s,u)$ and $X_i(t-s,w)$ is derived
by the following argument:

$$E[X_i(t-s,u)\,X_i(t-s,w)] = [p_i(w)/p_i(u)]\,E[X_i(t-s,u)^2];$$
$$\mathrm{Cov}[X_i(t-s,u),X_i(t-s,w)] = [p_i(w)/p_i(u)]\,E[X_i(t-s,u)^2]
- E[X_i(t-s,u)]\,E[X_i(t-s,w)],$$

where

$$E[X_i(t-s,u)] = p_i(u)X_i(t-s) \quad\text{and}\quad E[X_i(t-s,w)] = p_i(w)X_i(t-s);$$

and

$$E[X_i(t-s,u)]\,E[X_i(t-s,w)] = p_i(w)\,p_i(u)\,[X_i(t-s)]^2
= [p_i(w)/p_i(u)]\,[p_i(u)X_i(t-s)]^2
= [p_i(w)/p_i(u)]\,\{E[X_i(t-s,u)]\}^2.$$

Hence,

$$\mathrm{Cov}[X_i(t-s,u),X_i(t-s,w)] = [p_i(w)/p_i(u)]\,E[X_i(t-s,u)^2]
- [p_i(w)/p_i(u)]\,E^2[X_i(t-s,u)]$$
$$= [p_i(w)/p_i(u)]\,\mathrm{Var}[X_i(t-s,u)]$$
$$= [p_i(w)/p_i(u)]\,p_i(u)\,q_i(u)\,X_i(t-s)$$
$$= p_i(w)\,q_i(u)\,X_i(t-s).$$

To express the covariance between $X_i(t-s,s)$ and the
remaining number of the cohort with age $s-1$ (one period ago),
$X_i(t-s,s-1)$, we substitute $w = s$ and $u = s-1$ to obtain

$$\mathrm{Cov}[X_i(t-s,s),X_i(t-s,s-1)] = p_i(s)\,q_i(s-1)\,X_i(t-s), \quad\text{for } s \ge 1.$$

Similarly, by substituting $w = s$ and $u = s-2$,

$$\mathrm{Cov}[X_i(t-s,s),X_i(t-s,s-2)] = p_i(s)\,q_i(s-2)\,X_i(t-s), \quad\text{for } s \ge 2.$$

To obtain $\mathrm{Cov}[Y_i(t),Y_i(t-1)]$, the covariance existing
between all the remnants at time $t$ from cohorts which entered
grade $i$ and the remnants from the same cohorts at time $t-1$,
we sum on $s$ from 1 to $M$ to obtain

$$\mathrm{Cov}[Y_i(t),Y_i(t-1)] = \sum_{s=1}^{M} \mathrm{Cov}[X_i(t-s,s),X_i(t-s,s-1)]
= \sum_{s=1}^{M} p_i(s)\,q_i(s-1)\,X_i(t-s).$$
Similarly (mutatis mutandis),

$$\mathrm{Cov}[Y_i(t),Y_i(t-2)] = \sum_{s=2}^{M} \mathrm{Cov}[X_i(t-s,s),X_i(t-s,s-2)]
= \sum_{s=2}^{M} p_i(s)\,q_i(s-2)\,X_i(t-s).$$
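The lag-one and lag-two covariances for a single grade are plain weighted sums, as the sketch below shows. The function name, the list layout (`entries[s]` for the cohort that entered s epochs ago, `survival[s]` for the probability of surviving s epochs) and the numbers are assumptions for the example.

```python
def lag_covariance(entries, survival, lag):
    """Cov[Y_i(t), Y_i(t-lag)] = sum over s >= lag of p_i(s) q_i(s-lag) X_i(t-s).

    entries[s]  : hypothetical cohort size X_i(t-s), s = 0..M
    survival[s] : hypothetical survival probability p_i(s), with p_i(0) = 1
    """
    total = 0.0
    for s in range(lag, len(entries)):
        q_earlier = 1.0 - survival[s - lag]      # q_i(s - lag)
        total += survival[s] * q_earlier * entries[s]
    return total


entries = [1000, 900, 800, 700]
survival = [1.0, 0.8, 0.5, 0.2]
cov_lag1 = lag_covariance(entries, survival, 1)
cov_lag2 = lag_covariance(entries, survival, 2)
print(cov_lag1, cov_lag2)
```

The first term of each sum is zero because a cohort of age zero has not yet lost any members.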



The independence between cohorts starting in different
grades has already been established; thus, to obtain the
covariance between all remnants at time $t$ with all remnants
at time $t-1$ and with all remnants at time $t-2$, we sum on $i$
from 1 to $K$ the covariances of the $Y_i(t)$ terms to obtain,
respectively,

$$\mathrm{Cov}[Y(t),Y(t-1)] = \sum_{i=1}^{K} \mathrm{Cov}[Y_i(t),Y_i(t-1)]
= \sum_{i=1}^{K} \sum_{s=1}^{M} p_i(s)\,q_i(s-1)\,X_i(t-s)$$

and

$$\mathrm{Cov}[Y(t),Y(t-2)] = \sum_{i=1}^{K} \mathrm{Cov}[Y_i(t),Y_i(t-2)]
= \sum_{i=1}^{K} \sum_{s=2}^{M} p_i(s)\,q_i(s-2)\,X_i(t-s).$$

To summarize, the expressions for estimating the
remaining persons in the system who started in grade $i$ are

$$E^*(Y_i(t)) = E[Y_i(t)] + b_{i,t}\,\{Y_i(t-1) - E[Y_i(t-1)]\},$$

where

$$b_{i,t} = \mathrm{Cov}[Y_i(t),Y_i(t-1)]/\mathrm{Var}[Y_i(t-1)];$$

and

$$E^{**}(Y_i(t)) = E[Y_i(t)] + b^1_{i,t}\,\{Y_i(t-1) - E[Y_i(t-1)]\}
+ b^2_{i,t}\,\{Y_i(t-2) - E[Y_i(t-2)]\},$$

for

$$b^1_{i,t} = d\,(A^1 - A^2 A^3) \quad\text{and}\quad b^2_{i,t} = d\,(A^3 - A^1 A^2),$$

where

$$d = 1/\{1 - \rho^2[Y_i(t-1),Y_i(t-2)]\},$$
$$A^1 = \mathrm{Cov}[Y_i(t),Y_i(t-1)]/\mathrm{Var}[Y_i(t-1)] = b_{i,t},$$
$$A^2 = \mathrm{Cov}[Y_i(t-1),Y_i(t-2)]/\mathrm{Var}[Y_i(t-2)] = b_{i,t-1},$$

and

$$A^3 = \mathrm{Cov}[Y_i(t),Y_i(t-2)]/\mathrm{Var}[Y_i(t-2)].$$

The expressions for estimating the total number of
remaining persons are then

$$E^*(Y(t)) = E[Y(t)] + b_t\,\{Y(t-1) - E[Y(t-1)]\}$$

and

$$E^{**}(Y(t)) = E[Y(t)] + b^1_t\,\{Y(t-1) - E[Y(t-1)]\}
+ b^2_t\,\{Y(t-2) - E[Y(t-2)]\};$$

where

$$b_t = \mathrm{Cov}[Y(t),Y(t-1)]/\mathrm{Var}[Y(t-1)]
= \sum_{i=1}^{K} \mathrm{Cov}[Y_i(t),Y_i(t-1)] \Big/ \Big\{\sum_{i=1}^{K} \mathrm{Var}[Y_i(t-1)]\Big\},$$
$$b^1_t = D\,\{b_t - b_{t-1}\,\mathrm{Cov}[Y(t),Y(t-2)]/\mathrm{Var}[Y(t-2)]\},$$
$$b^2_t = D\,\{\mathrm{Cov}[Y(t),Y(t-2)]/\mathrm{Var}[Y(t-2)] - b_t\,b_{t-1}\},$$

and

$$D = 1/\{1 - \rho^2[Y(t-1),Y(t-2)]\}.$$

This model for predicting the total number in the system
at time $t$ is applied to the problem of predicting total
student enrollment in Section V.
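The pieces of this section can be combined into a small predictor for the total organization size. The following Python sketch is one possible arrangement, not the procedure used for the results in Section V: the function names, the data layout (`entries[i][k]` for the cohort that entered grade i at time t-k, with at least two more epochs of history than there are survival probabilities) and the numbers are all assumptions for illustration.

```python
def moments(entries, survival, shift):
    """Mean and variance of Y(t - shift); entries[i][k] is X_i(t - k)."""
    mean = var = 0.0
    for x_i, p_i in zip(entries, survival):
        for s, p in enumerate(p_i):
            x = x_i[shift + s]
            mean += x * p
            var += x * p * (1.0 - p)
    return mean, var


def covariance(entries, survival, shift, lag):
    """Cov[Y(t - shift), Y(t - shift - lag)] summed over grades."""
    cov = 0.0
    for x_i, p_i in zip(entries, survival):
        for s in range(lag, len(p_i)):
            cov += p_i[s] * (1.0 - p_i[s - lag]) * x_i[shift + s]
    return cov


def predict_total(entries, survival, y_prev, y_prev2=None):
    """E*(Y(t)) if only Y(t-1) is supplied, otherwise E**(Y(t))."""
    m0, _ = moments(entries, survival, 0)
    m1, v1 = moments(entries, survival, 1)
    b_t = covariance(entries, survival, 0, 1) / v1
    if y_prev2 is None:
        return m0 + b_t * (y_prev - m1)
    m2, v2 = moments(entries, survival, 2)
    cov12 = covariance(entries, survival, 1, 1)
    b_prev = cov12 / v2                                  # b_{t-1}
    a = covariance(entries, survival, 0, 2) / v2
    d = 1.0 / (1.0 - cov12 ** 2 / (v1 * v2))
    b1 = d * (b_t - b_prev * a)
    b2 = d * (a - b_t * b_prev)
    return m0 + b1 * (y_prev - m1) + b2 * (y_prev2 - m2)


# One grade, survival known for ages 0..2, five epochs of entry data.
entries = [[100, 110, 120, 130, 140]]
survival = [[1.0, 0.6, 0.3]]
print(predict_total(entries, survival, y_prev=231.0))
print(predict_total(entries, survival, y_prev=231.0, y_prev2=245.0))
```

When the past realizations equal their expected values the correction terms vanish and both predictors return the unconditional mean E[Y(t)].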
III. MEETING SPECIFIED GOALS



Consider an organization in which desired numbers are
specified for personnel in each grade $i$ at times $t+1$,
$t+2$, ... in the future. It is the objective of this
section to extend the model of Section II to predict
the future grade sizes within the organization.

Define $p_{ij}(t,u)$ to be that fraction of personnel in
grade $j$ at time $u$ given they entered the system in grade $i$
at time $t$, where $t < u$. If the grades are numbered in
hierarchical order, $i < j$ means a promotion, $i > j$ means a
demotion and $i = j$ means that $p_{ii}(t,u)$ is the fraction
which remain in grade $i$ in the period $t$ to $u$. If the rates
of movement between grades are stable for the time period
being considered, an assumption of stationarity or
independence of time $t$ can be made about the transferred
fractions, and $p_{ij}(t,u)$ can be expressed as $p_{ij}(u-t)$; that
is, the fraction transferred from grade $i$ to $j$ is a
function only of the elapsed time $u-t$. Let $X_{ij}(t,u)$ denote
the number in grade $j$ at time $u$ of that cohort which
entered the organization at time $t$ in grade $i$. We then
have that $X_{ij}(t,u)$ is a binomially distributed random
variable with parameters $X_i(t)$ and $p_{ij}(u-t)$.

By representing the number of persons in grade $j$ at
time $t$ as $Y^j(t)$ in terms of the cohort portions remaining
from the initial size at entry into grade $i$ in all previous
epochs, we sum on $i$ and sum on $s$ to obtain

$$Y^j(t) = \sum_{s=0}^{M} \sum_{i=1}^{K} X_{ij}(t-s,t).$$

It should be noted that $X_{ij}(t-s,t)$ is independent of
$X_{ij}(t-r,t)$ for $s \neq r$, since the cohort entering grade $i$
at time $t-s$ acts independently of the cohort entering
grade $i$ at time $t-r$. Note also that $X_{ij}(t-s,t)$ is
independent of $X_{kj}(t-s,t)$ for $i \neq k$, since these are the
remnants presently in grade $j$ at time $t$ of those who
entered the system at time $t-s$ in different grades. For
$s = 0$, $X_{ij}(t,t)$ is 0 when $i \neq j$ and $X_{ii}(t,t)$ is $X_i(t)$;
that is, a cohort which just entered the system at time $t$
in grade $i$ will have no opportunity to diminish in size
or to be transferred to another grade in the same time
period. Of note is the fact that independence does not
exist between $Y^j(t)$ and $Y^k(t)$ for $j \neq k$, since in each of
these random variables there may exist people from the
same initial cohort. At this point, the distinction
between $Y_i(t)$ and $Y^i(t)$ should be clear. In expressing
$Y_i(t)$, we are counting the remnants of $X_i(t-s)$ in all
grades. In expressing $Y^i(t)$, we are counting the number
presently in grade $i$ as remnants from all previous time
periods of cohort entries into all grades.

We are now able to express the mean and variance of
$Y^j(t)$ as the sums of the independent means and variances,
respectively, of $X_{ij}(t-s,t)$:

$$E[Y^j(t)] = E\Big[\sum_{i=1}^{K} \sum_{s=0}^{M} X_{ij}(t-s,t)\Big]
= \sum_{i=1}^{K} \sum_{s=0}^{M} p_{ij}(s)\,X_i(t-s)$$

and

$$\mathrm{Var}[Y^j(t)] = \sum_{i=1}^{K} \sum_{s=0}^{M} p_{ij}(s)\,q_{ij}(s)\,X_i(t-s)
\quad\text{for } j = 1,2,\ldots,K,$$

where $p_{ii}(0) = 1$ and $q_{ij}(s) = 1 - p_{ij}(s)$.



Denote by $G_i(t+1), G_i(t+2), \ldots$ the desired goals for
the size of grade $i$ at times $t+1, t+2, \ldots$ respectively.
Denote by $Z_i(t+1), Z_i(t+2), \ldots$ the numbers in grade $i$ at
times $t+1, t+2, \ldots$ which were in the system at times
$t, t+1, \ldots$ in one of the $K$ grades of the organization.
The number $X_i(t+1)$ is now a controlled variable; that is,
the number to recruit into grade $i$ at time $t+1$ in order to
attain $G_i(t+1)$ is $X_i(t+1) = G_i(t+1) - Z_i(t+1)$.

We now assume that no demotions take place; $p_{ij}(s)$ is
zero for $j < i$. At the end of each epoch, a member is
promoted, remains in the present grade or leaves the
organization. With these restrictions placed on the system, we
have that the expected values for $Z_i(t+1)$ and $Z_i(t+2)$ are

$$E[Z_i(t+1)] = \sum_{j=1}^{i} \sum_{s=0}^{M} X_j(t-s)\,p_{ji}(s)$$

and

$$E[Z_i(t+2)] = \sum_{j=1}^{i} \sum_{s=0}^{M} X_j(t-s+1)\,p_{ji}(s), \quad\text{for } i = 1,2,\ldots,K.$$

The variance terms are

$$\mathrm{Var}[Z_i(t+1)] = \sum_{j=1}^{i} \sum_{s=0}^{M} X_j(t-s)\,p_{ji}(s)\,q_{ji}(s)$$

and

$$\mathrm{Var}[Z_i(t+2)] = \sum_{j=1}^{i} \sum_{s=0}^{M} X_j(t-s+1)\,p_{ji}(s)\,q_{ji}(s).$$

With the same definition as in Section II for $p_i(s)$ (that
is, the fraction of $X_i(t-s)$ remaining in the system at
time $t$), we have the restriction that
$\sum_{j=1}^{K} p_{ij}(s) = p_i(s)$
for all values of $s$ (i.e., $s = 0,1,\ldots,M$).

Suppose our problem is to avoid overmanning or
undermanning grade $i$ ($i = 1,2,\ldots,K$) in the periods $t+1$, $t+2$, when
the recruiting for these periods must be planned ahead.
Assume that there are penalties defined when the number
of personnel in each grade is above or below the goals;
i.e., $C_i^+(t)$ when the number is over the desired values and
$C_i^-(t)$ when below. The objective function to be minimized
is

$$E_Z\Big[\sum_{x=1}^{2} \sum_{i=1}^{K} \big\{C_i^+(t+x)\,\mathrm{Max}[0,\,G_i(t+x) - X_i(t+x) - Z_i(t+x)]
- C_i^-(t+x)\,\mathrm{Min}[0,\,G_i(t+x) - X_i(t+x) - Z_i(t+x)]\big\}\Big],$$

where $E_Z(\cdot)$ denotes the expected value of $(\cdot)$ over the joint
density of $Z = (Z_1(t+1), Z_2(t+1), \ldots, Z_K(t+1), Z_1(t+2), \ldots, Z_K(t+2))$,
a vector of random variables with a multivariate normal
distribution. The optimal values of $X_i(t+1)$ and $X_i(t+2)$
would minimize this objective function. When $C_i^+ = C_i^- = C_i$,
the expression reduces to

$$\sum_{x=1}^{2} \sum_{i=1}^{K} C_i(t+x)\,\{G_i(t+x) - X_i(t+x) - E[Z_i(t+x)]\},$$

where the expected values of $Z_i(t+1)$ and
$Z_i(t+2)$ are given above.
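For a single grade and a single period, one term of the objective function can be evaluated in closed form when Z is normal, using the standard identity E[max(0, D)] = d Φ(d/σ) + σ φ(d/σ) for D normal with mean d and standard deviation σ. The sketch below applies that identity; it pairs the two cost coefficients with the Max and Min terms in the same order as the expression above. The function names and the numbers are assumptions for illustration, and the marginal (one grade, one period) treatment is a simplification of the joint expectation.

```python
import math


def normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_penalty(goal, recruits, mean_z, sd_z, cost_plus, cost_minus):
    """E[cost_plus * Max(0, D) - cost_minus * Min(0, D)], D = goal - recruits - Z.

    Z is assumed normal with mean mean_z and standard deviation sd_z > 0.
    """
    d = goal - recruits - mean_z                   # E[D]
    k = d / sd_z
    e_max = d * normal_cdf(k) + sd_z * normal_pdf(k)   # E[Max(0, D)]
    e_neg_min = e_max - d                              # E[-Min(0, D)]
    return cost_plus * e_max + cost_minus * e_neg_min


# Recruiting exactly to the expected gap leaves only the spread of Z.
print(expected_penalty(goal=500, recruits=100, mean_z=400, sd_z=10,
                       cost_plus=1.0, cost_minus=1.0))
```

Scanning `recruits` over a range of values and keeping the minimizer gives a simple way to explore how unequal costs shift the preferred recruiting level.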

We now assume that the realizations $Y_i(t)$, $i = 1,2,\ldots,K$,
are known and proceed to determine the best predictors for
$Z_i(t+1)$ and $Z_i(t+2)$. Assuming the normality property for
$Z_i$ and $Y_j$, this best predictor will be the expectation of
$Z_i$ conditioned on the values of $Y_j$, $j = 1,2,\ldots,i$, for the
present time. Defining $E'[Z_i(t+1)]$ as the best predictor
of $Z_i(t+1)$ and $E''[Z_i(t+2)]$ as the best predictor of $Z_i(t+2)$
when the values of $Y_j(t)$ are known, $j = 1,2,\ldots,K$, we have

$$E'[Z_i(t+1)] = E[Z_i(t+1) \mid Y_j(t),\; j = 1,2,\ldots,i]$$

and

$$E''[Z_i(t+2)] = E[Z_i(t+2) \mid Y_j(t),\; j = 1,2,\ldots,i],$$

where the realizations of $Y_j(t)$ for $j > i$ have no effect on
$E'$ and $E''$ and are therefore not considered in the
expressions. (This follows from the fact that $p_{ji}(s)$ is zero
for $i < j$.)

With the independence of $Y_i(t)$ and $Y_j(t)$ for $i \neq j$, we
have that $\mathrm{Cov}[Y_i(t),Y_j(t)] = 0$ for $i \neq j$. Since $X_i(t+1)$ is
a control variable, we also have that $\mathrm{Var}[Y^i(t+x)] = \mathrm{Var}[Z_i(t+x)]$
and $\mathrm{Cov}[Y^i(t+x),Y_j(t)] = \mathrm{Cov}[Z_i(t+x),Y_j(t)]$ for $x = 1,2$ and
$i = 1,2,\ldots,K$. Extending Theorem Two of Section II to $E'$ and
$E''$ we thus have that

$$E'(Z_i(t+1)) = E[Z_i(t+1)] + \sum_{j=1}^{i} d_t^{ji}\,\{Y_j(t) - E[Y_j(t)]\}$$

and

$$E''(Z_i(t+2)) = E[Z_i(t+2)] + \sum_{j=1}^{i} g_t^{ji}\,\{Y_j(t) - E[Y_j(t)]\},$$

where

$$d_t^{ji} = \mathrm{Cov}[Y^i(t+1),Y_j(t)]/\mathrm{Var}[Y_j(t)]$$

and

$$g_t^{ji} = \mathrm{Cov}[Y^i(t+2),Y_j(t)]/\mathrm{Var}[Y_j(t)]$$

for $i = 1,2,\ldots,K$. The terms $E[Z_i(t+x)]$, $x = 1,2$, and
$\mathrm{Var}[Y_j(t)]$ have been derived previously, leaving the terms
$\mathrm{Cov}(Y^i(t+x),Y_j(t))$, $x = 1,2$, to be derived.

J 

Given that a member who entered the organization at
time $t-s$ in grade $j$ was in the organization at time $t$
(i.e., a member of $X_j(t-s,s)$), the probability that this
member will be in grade $i$ at time $t+x$ is $p_{ji}(s+x)/p_j(s)$.
The conditional expectation $E[X_{ji}(t-s,s+x) \mid X_j(t-s,s)]$
(that is, the expectation of the number in grade $i$ at time
$t+x$ of those who started in grade $j$ at time $t-s$, conditioned
on the number remaining in the organization at time $t$ who
started at time $t-s$ in grade $j$) is $p_{ji}(s+x)\,X_j(t-s,s)/p_j(s)$.
Using an argument similar to that used in deriving $\mathrm{Cov}[Y(t),Y(t-1)]$
in Section II, we have that
$$E[X_{ji}(t-s,s+x)\,X_j(t-s,s)] = \frac{p_{ji}(s+x)}{p_j(s)}\,E[(X_j(t-s,s))^2];$$

hence

$$\mathrm{Cov}[X_{ji}(t-s,s+x),X_j(t-s,s)] = \frac{p_{ji}(s+x)}{p_j(s)}\,\mathrm{Var}[X_j(t-s,s)]
= p_{ji}(s+x)\,q_j(s)\,X_j(t-s).$$

Since independence exists between $X_j(t-s,s)$ and $X_j(t-r,r)$
and between $X_{ji}(t-s,s+x)$ and $X_{ji}(t-r,r+x)$ for $s \neq r$, we have

$$\mathrm{Cov}[Y^i(t+x),Y_j(t)] = \sum_{s=0}^{M-x} \mathrm{Cov}[X_{ji}(t-s,s+x),X_j(t-s,s)]
= \sum_{s=0}^{M-x} p_{ji}(s+x)\,q_j(s)\,X_j(t-s),$$

for $x = 1,2$; $j = 1,2,\ldots,i$.
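The cross-covariance sum and the resulting predictor coefficient can be sketched as follows. The function names, argument layout and numbers are assumptions for illustration: `entries_j[s]` stands for the cohort that entered grade j at time t-s, `p_j[s]` for the probability of remaining in the system s epochs, and `p_ji[s]` for the fraction found in grade i after s epochs.

```python
def cross_covariance(entries_j, p_ji, p_j, x):
    """Cov[Y^i(t+x), Y_j(t)] = sum_{s=0}^{M-x} p_ji(s+x) q_j(s) X_j(t-s)."""
    m = len(p_j) - 1                                   # M
    return sum(p_ji[s + x] * (1.0 - p_j[s]) * entries_j[s]
               for s in range(0, m - x + 1))


def coefficient(entries_j, p_ji, p_j, x):
    """Cov[Y^i(t+x), Y_j(t)] / Var[Y_j(t)]: d for x = 1, g for x = 2."""
    var_yj = sum(xs * p * (1.0 - p) for xs, p in zip(entries_j, p_j))
    return cross_covariance(entries_j, p_ji, p_j, x) / var_yj


# Illustrative values only (M = 2).
entries_j = [500, 400, 300]
p_j = [1.0, 0.7, 0.4]
p_ji = [0.0, 0.2, 0.3]
print(cross_covariance(entries_j, p_ji, p_j, 1))
print(coefficient(entries_j, p_ji, p_j, 1))
```

The age-zero term always vanishes because q_j(0) = 0, so only cohorts that have already had a chance to lose members contribute.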

The use of $E'$ and $E''$ in place of $E(Z_i(t+1))$ and
$E(Z_i(t+2))$ respectively (using the present values of
$Y_i(t)$) is relevant in the objective function when
$C_i(t+x) = C_i^+(t+x) = C_i^-(t+x)$, $x = 1,2$. The variances
(using the fact that $\mathrm{Cov}(Y_i(t),Y_j(t))$ is zero for $i \neq j$)
are

$$\mathrm{Var}'[Z_i(t+1)] = \mathrm{Var}[Z_i(t+1)] + \sum_{j=1}^{i} \big\{(d_t^{ji})^2\,\mathrm{Var}[Y_j(t)]
- 2\,d_t^{ji}\,\mathrm{Cov}[Y^i(t+1),Y_j(t)]\big\}$$

and

$$\mathrm{Var}''[Z_i(t+2)] = \mathrm{Var}[Z_i(t+2)] + \sum_{j=1}^{i} \big\{(g_t^{ji})^2\,\mathrm{Var}[Y_j(t)]
- 2\,g_t^{ji}\,\mathrm{Cov}[Y^i(t+2),Y_j(t)]\big\}.$$

An application of such a model might be that of a
university system with the grades defined by curriculum and
level (upper and lower, for instance), in which the goals
$G_i(t+1)$ and $G_i(t+2)$ are specified to utilize the
facilities fully without causing a shortage of classrooms by
over-enrolling. The costs $C_i(t+1)$ and $C_i(t+2)$ might be
based on the financial losses incurred by overstaffing
for a class size below the desired level and the losses
incurred through the added administrative burden of rejecting
enrolled students when overages occur.
IV. SENSITIVITY TO ERRORS IN PROBABILITY ESTIMATION



The fraction of individuals who start in a
grade and remain for $s$ epochs, $s = 1,2,\ldots,M$, is used as an
estimate of the probability of this event. In the paper by
McAfee (1970), a statistical test is made on a sample of
size three to test the hypothesis that $p_i(s)$ for each
cohort is from the same population. That is to say,
for moderate sample sizes in estimating $p_i(s)$ we have a
sample mean to use, which for large sample sizes approaches
the true value of $p_i(s)$. At best, we use the estimate, and
for this reason we examine the sensitivity of the model
described in Section II to the error which may exist between
our estimate and the true value of $p_i(s)$; viz., the
sensitivity of the expected values $E[Y(t)]$, $E^*[Y(t)]$ and
$E^{**}[Y(t)]$ to errors in $p_i(s)$. We shall consider two cases:
one in which an error $\Delta p_i(s)$ exists between our estimate
and the true value of $p_i(s)$ for some $s$ and $i$; the second
in which an error exists for all $s$ and $i$.

Taking the partials of $E[Y(t)]$, $E^*[Y(t)]$ and $E^{**}[Y(t)]$
with respect to $p_i(s)$ yields the following expressions
(where $\partial(\cdot)/\partial p_i(s)$ denotes the partial derivative of $(\cdot)$ with
respect to $p_i(s)$):

$$\partial E[Y(t)]/\partial p_i(s) = \partial/\partial p_i(s)\,[X_i(t-s)\,p_i(s)] = X_i(t-s);$$

$$\partial E^*[Y(t)]/\partial p_i(s) = \partial/\partial p_i(s)\,E[Y(t)]
+ b_t\,\partial/\partial p_i(s)\,\{Y(t-1) - E[Y(t-1)]\}
+ \{Y(t-1) - E[Y(t-1)]\}\,\partial b_t/\partial p_i(s)$$
$$= X_i(t-s) - b_t\,X_i(t-s-1) + R^*;$$

and

$$\partial E^{**}[Y(t)]/\partial p_i(s) = \partial/\partial p_i(s)\,E[Y(t)]
+ b^1_t\,\partial/\partial p_i(s)\,\{Y(t-1) - E[Y(t-1)]\}
+ b^2_t\,\partial/\partial p_i(s)\,\{Y(t-2) - E[Y(t-2)]\}$$
$$\qquad + \{Y(t-1) - E[Y(t-1)]\}\,\partial b^1_t/\partial p_i(s)
+ \{Y(t-2) - E[Y(t-2)]\}\,\partial b^2_t/\partial p_i(s)$$
$$= X_i(t-s) - b^1_t\,X_i(t-s-1) - b^2_t\,X_i(t-s-2) + R^{**};$$

where $R^*$ and $R^{**}$ represent sums of remainder terms negligible
in comparison to the other terms in the expressions for
$\partial E^*/\partial p_i(s)$ and $\partial E^{**}/\partial p_i(s)$ respectively. (See the appendix for the
exact expressions represented by $R$.)

Neglecting $R$ for small errors in the estimate for $p_i(s)$,
the changes in predicted values are then

$$\Delta E[Y(t)] = X_i(t-s)\,\Delta p_i(s),$$
$$\Delta E^*[Y(t)] = \{X_i(t-s) - b_t\,X_i(t-s-1)\}\,\Delta p_i(s),$$

and

$$\Delta E^{**}[Y(t)] = \{X_i(t-s) - b^1_t\,X_i(t-s-1) - b^2_t\,X_i(t-s-2)\}\,\Delta p_i(s).$$
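A short numerical sketch makes the dampening visible. The function name and every number below (cohort sizes, the size of the error in the survival probability, and the three coefficients) are hypothetical values chosen for illustration; the remainder terms are neglected, as in the expressions above.

```python
def prediction_changes(x_s, x_s1, x_s2, delta_p, b_t, b1_t, b2_t):
    """First-order changes in E, E* and E** for an error delta_p in p_i(s).

    x_s, x_s1, x_s2 stand for X_i(t-s), X_i(t-s-1) and X_i(t-s-2).
    """
    base = x_s * delta_p
    one_step = (x_s - b_t * x_s1) * delta_p
    two_step = (x_s - b1_t * x_s1 - b2_t * x_s2) * delta_p
    return base, one_step, two_step


base, one_step, two_step = prediction_changes(
    x_s=1000, x_s1=950, x_s2=900, delta_p=0.02, b_t=0.5, b1_t=0.4, b2_t=0.2)
print(base, one_step, two_step)
```

With these assumed inputs the same error in the survival probability moves the plain expectation by about 20 persons, the one-period predictor by about 10.5 and the two-period predictor by about 8.8.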
It is observed in the expression for $\Delta E^*(Y(t))$ that as
$b_t$ approaches one, the error in the predicted value is
$\Delta p_i(s)\,\Delta X_i$, where $\Delta X_i = X_i(t-s) - X_i(t-s-1)$. Since the
covariance of $Y(t)$ and $Y(t-1)$ is always positive, $b_t$ takes on a
value between zero and one, and $b_t$ acts as a dampening
coefficient for errors in the predicted value of $Y(t)$ using
$E^*$.

Similarly for the expression of $\Delta E^{**}(Y(t))$, as
$(b^1_t + b^2_t) \to 1$, the error in the predicted value is
$\Delta p_i(s)\,(b^1_t\,\Delta X_i^1 + b^2_t\,\Delta X_i^2)$, where $\Delta X_i^1 = X_i(t-s) - X_i(t-s-1)$
and $\Delta X_i^2 = X_i(t-s) - X_i(t-s-2)$ (viz., the differences
between the cohort entering grade $i$ at time $t-s$ and those
which entered in the periods before). The dampening effect on
the error in $E^{**}[Y(t)]$ is evident in the $b^k_t$ coefficients
($k = 1,2$). Thus, a smaller error in the predicted value of
$Y(t)$ results when $E^*$ and $E^{**}$ are used rather than $E$.

Let us assume now that $p_i(s) = (p_i)^s$ for all $i$; i.e.,
we have a geometric distribution of remaining members of
each grade. If $X_i(t)$ were a constant for all $t$ (that is,
$X_i(t) = X_i$), then

$$E[Y(t)] = \sum_{i=1}^{K} E[Y_i(t)] = \sum_{i=1}^{K} X_i/(1-p_i);$$

$$E^*(Y(t)) = \sum_{i=1}^{K} X_i/(1-p_i) + b_t\,\Big\{Y(t-1) - \sum_{i=1}^{K} X_i/(1-p_i)\Big\},$$

where $b_t$ is the coefficient defined in Section II, and

$$E^{**}(Y(t)) = \sum_{i=1}^{K} X_i/(1-p_i) + b^1_t\,\Big[Y(t-1) - \sum_{i=1}^{K} X_i/(1-p_i)\Big]
+ b^2_t\,\Big[Y(t-2) - \sum_{i=1}^{K} X_i/(1-p_i)\Big],$$



for 



$$b_t^1 = \frac{b}{1-b}\left[1 - \frac{\sum_{i=1}^{K} p_i^2 X_i}{\sum_{i=1}^{K} p_i X_i}\right] = b^1$$

and

$$b_t^2 = \frac{1}{1-b}\left[\frac{\sum_{i=1}^{K} p_i^2 X_i}{\sum_{i=1}^{K} p_i X_i} - (b)^2\right] = b^2.$$



The resultant terms for the changes in prediction are then

$$\Delta E(Y(t)) = \sum_{i=1}^{K} \Delta p_i X_i/(1-p_i)^2,$$

$$\Delta E^{*}(Y(t)) = \sum_{i=1}^{K} \Delta p_i X_i (1-b)/(1-p_i)^2 = (1-b)\,\Delta E(Y(t)),$$

and

$$\Delta E^{**}(Y(t)) = \sum_{i=1}^{K} \Delta p_i X_i (1-b^1-b^2)/(1-p_i)^2 = (1-b^1-b^2)\,\Delta E(Y(t)).$$

The dampening factors for $\Delta E^{*}(Y(t))$ and $\Delta E^{**}(Y(t))$
are immediately evident.

Using E* or E** in place of E, we can reduce the prediction
error caused by errors in $p_i(s)$.
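
The geometric case lends itself to a quick numerical check. The sketch below is illustrative only: the recruit numbers, retention parameters, and dampening coefficients are hypothetical, and the helper names are not from the original work. It evaluates the steady-state total and its first-order change, then compares the latter with the exact change obtained by recomputing the total.

```python
def expected_total(x, p):
    """Steady-state E[Y(t)] = sum of X_i / (1 - p_i) over grades."""
    return sum(xi / (1.0 - pi) for xi, pi in zip(x, p))


def delta_expected_total(x, p, dp):
    """First-order change in E[Y(t)] for errors dp_i in the p_i."""
    return sum(dpi * xi / (1.0 - pi) ** 2 for xi, pi, dpi in zip(x, p, dp))


def damped_delta(x, p, dp, *coeffs):
    """Change for E* (one coefficient) or E** (two): (1 - sum b) * dE."""
    return (1.0 - sum(coeffs)) * delta_expected_total(x, p, dp)


if __name__ == "__main__":
    x = [1000.0, 500.0]   # hypothetical constant recruits per grade
    p = [0.6, 0.5]        # hypothetical geometric retention parameters
    dp = [0.001, 0.001]   # hypothetical estimation errors
    first_order = delta_expected_total(x, p, dp)
    exact = expected_total(x, [pi + d for pi, d in zip(p, dp)]) - expected_total(x, p)
    print(first_order, exact)
    print(damped_delta(x, p, dp, 0.5))         # E* with b = 0.5
    print(damped_delta(x, p, dp, 0.45, 0.05))  # E** with b^1 + b^2 = 0.5
```

For small errors the first-order change and the exact change agree closely, and the damped values are smaller by the factor (1 - b) or (1 - b^1 - b^2).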
V. APPLICATION TO UNIVERSITY ENROLLMENT



The problem of predicting total student attendance at
the University of California, Berkeley, is approached using
the model for the best predictors, E*(Y(t)) and E**(Y(t)).
Data were obtained from Berkeley for new student enrollments
and are tabulated in Table I.

Total student predictions are required for only the
fall quarter of each year. To estimate the $p_i^j(s)$ (the
probability of a student remaining to the s-th subsequent
fall period following entry into grade i, quarter j), data
were collected on the numbers of students from the cohorts
entering in the Fall of 1966 and the Winter, Spring and
Summer of 1967 who were still in attendance each succeeding
fall term. The most recent data available were from 1969,
which included at most 3 years for any cohort. To estimate
$p_i^j(s)$ for years in attendance 4 through 6, we assumed
that students' attendance behavior over time is essentially
stationary and used the past cohort data analysis found
in Suslow et al (1968). Our estimates of $p_i^j(s)$ are given
in Table II. It should be pointed out that during the
years 1961 through 1966, the University followed a semester
system. Starting in 1967, the University switched to a
quarter system.
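
The estimation step described above can be sketched as follows. This is an illustration under stated assumptions, not the procedure actually coded for the study: the cohort counts and the historical fractions are hypothetical placeholders, and the helper names are invented for the example. Each fraction is the number from a cohort still attending in a given fall divided by the cohort's size at entry, and fractions for later years are taken from an older study on the stationarity assumption.

```python
def retention_fractions(cohort_size, still_enrolled):
    """Fraction of a cohort attending in each successive fall after entry."""
    if cohort_size <= 0:
        raise ValueError("cohort_size must be positive")
    return [n / cohort_size for n in still_enrolled]


def extend_with_history(recent, historical):
    """Keep the recent estimates and fill later years from older data.

    Assumes attendance behavior is stationary, so fractions observed for
    earlier cohorts can stand in for years not yet observed.
    """
    return list(recent) + list(historical[len(recent):])


if __name__ == "__main__":
    # Hypothetical cohort of 200 entrants tracked over three falls.
    recent = retention_fractions(200, [150, 120, 90])
    # Hypothetical six-year fractions from an earlier cohort study.
    historical = [0.80, 0.65, 0.50, 0.30, 0.10, 0.05]
    print(extend_with_history(recent, historical))
```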

The parameters calculated with the data in Tables I and 
II are shown in Table III. Our estimates of total student 






TABLE I 

NEW ENROLLMENTS AT THE UNIVERSITY OF CALIFORNIA, BERKELEY
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1966 jo 0 00 0 0 00 

1967 I 95 12^ 176 175 104 259 72 

1968 111 143 251 48 409 246 517 205 

1969 193 254 507 84 915 241 640 176 



TABLE II

FRACTIONS OF STUDENTS ATTENDING EACH SUCCESSIVE FALL AFTER ENROLLMENT
TABLE III

VALUES USED IN PREDICTION

CASE A (Summer enrollees treated as Fall enrollees)

Year                      1966      1967      1968      1969
E(Y(t))                16919.5   18339.5   18365.5   18133.7
Var(Y(t))^(1/2)          67.66     67.86     71.02     70.83
Cov(Y(t),Y(t-1))                  2261.1    2186.0    2349.6
Cov(Y(t),Y(t-2))                             984.5     970.6
ρ(Y(t),Y(t-1))                    .49245    .45358    .46707
b_t                               .49400    .47471    .46586
b_t^1                                       .48641    .46055
b_t^2                                      -.02561   -.01305
Y(t)-E(Y(t))            -172.5    -2.518   -374.49   -17.742
Var*(Y(t))^(1/2)                   59.06     63.29     62.63
Var**(Y(t))^(1/2)                            63.49     62.63

CASE B (Summer enrollees treated separately)

Year                      1966      1967      1968      1969
E(Y(t))                16919.5   18186.9   17848.4   17329.6
Var(Y(t))^(1/2)          67.66     68.73     73.04     73.60
Cov(Y(t),Y(t-1))                  2177.6    2344.6    2398.7
Cov(Y(t),Y(t-2))                             971.9    1077.6
ρ(Y(t),Y(t-1))                    .46828    .46704    .44559
b_t                               .47577    .49637    .44955
b_t^1                                       .51040    .44672
b_t^2                                       .03070    .00637
Y(t)-E(Y(t))            -172.5    150.12    142.61    786.42
Var*(Y(t))^(1/2)                   60.75     64.59     65.89
Var**(Y(t))^(1/2)                            64.74     65.78

enrollment in the falls of 1967 through 1969 are given in
Table IV with actual enrollment figures for comparison.

Due to a quota limit for fall enrollees and the start of
year-round operations in the Summer of 1968, it was felt
that summer enrollees in 1968 and 1969 might not behave as
summer enrollees did in 1967, but might in fact be early fall
applicants who enrolled in the summer rather than risk
unsuccessful enrollment in the fall. This possibility, plus
the relatively small sample size from 1967 for estimating the
probabilities for summer enrollees to remain in the system,
leads to two cases for estimation: Case A, in which summer
enrollees are treated as fall enrollees; and Case B, in which
summer enrollees are considered separately from the fall
enrollees. The two cases are tabulated in Tables III and IV.
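
The one-period entries of Table III, Case A, can be cross-checked from the tabulated standard deviations and covariances. The sketch below assumes the usual best linear predictor relations, namely b_t = Cov(Y(t),Y(t-1))/Var(Y(t-1)), ρ equal to the covariance divided by the product of the two standard deviations, and a conditional standard deviation of Var(Y(t))^(1/2) times (1 - ρ^2)^(1/2). These relations are an assumption made here for illustration; they reproduce the tabulated values to within rounding. The names `SD`, `COV1`, and `one_lag_values` are hypothetical.

```python
import math

# Values copied from Table III, Case A.
SD = {1966: 67.66, 1967: 67.86, 1968: 71.02, 1969: 70.83}   # Var(Y(t))^(1/2)
COV1 = {1967: 2261.1, 1968: 2186.0, 1969: 2349.6}           # Cov(Y(t),Y(t-1))


def one_lag_values(year):
    """Return (b_t, rho, conditional standard deviation) for one year."""
    var_prev = SD[year - 1] ** 2
    b_t = COV1[year] / var_prev
    rho = COV1[year] / (SD[year] * SD[year - 1])
    sd_star = SD[year] * math.sqrt(1.0 - rho ** 2)
    return b_t, rho, sd_star


if __name__ == "__main__":
    for year in (1967, 1968, 1969):
        b_t, rho, sd_star = one_lag_values(year)
        print(year, round(b_t, 5), round(rho, 5), round(sd_star, 2))
```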
TABLE IV

PREDICTED TOTAL FALL ENROLLMENT
(With plus or minus two standard deviations)

CASE A (Summer Enrollees treated as Fall Enrollees)

Year                            1967        1968        1969
Estimate using E           18339±135   18365±142   18133±141
Estimate using E*          18256±118   18364±126   17959±125
Estimate using E**                     18368±126   17961±125

CASE B (Summer Enrollees treated separately)

Year                            1967        1968        1969
Estimate using E           18186±137   17848±146   17329±147
Estimate using E*          18104±122   17992±129   17393±131
Estimate using E**                     17919±129   17394±131

Actual Total Enrollment        18337       17991       18116
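
The way the Table IV estimates follow from the Table III values can be sketched in a few lines. This is an illustrative reconstruction, with hypothetical function names, using the predictor forms given in Section IV: the unconditional mean is corrected by the previous period's prediction error (and, for E**, the error two periods back), and the interval is the estimate plus or minus two conditional standard deviations.

```python
def predict_one_lag(e_t, b_t, resid_prev):
    """E*(Y(t)) = E[Y(t)] + b_t * (Y(t-1) - E[Y(t-1)])."""
    return e_t + b_t * resid_prev


def predict_two_lag(e_t, b1, b2, resid_prev, resid_prev2):
    """E**(Y(t)) with corrections from the two preceding periods."""
    return e_t + b1 * resid_prev + b2 * resid_prev2


def two_sigma_band(estimate, sd):
    """Estimate plus or minus two standard deviations."""
    return estimate - 2.0 * sd, estimate + 2.0 * sd


if __name__ == "__main__":
    # Table III, Case A, predicting Fall 1969.
    e_1969 = 18133.7
    resid_1968, resid_1967 = -374.49, -2.518
    e_star = predict_one_lag(e_1969, 0.46586, resid_1968)
    e_2star = predict_two_lag(e_1969, 0.46055, -0.01305, resid_1968, resid_1967)
    print(round(e_star, 1), two_sigma_band(e_star, 62.63))
    print(round(e_2star, 1), two_sigma_band(e_2star, 62.63))
```

With the Case A inputs the two estimates land within one student of the 1969 entries in Table IV.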
APPENDIX A



REMAINDER TERMS TO PREDICTION ERROR



In Section IV, expressions for $dE^{*}/dp_i(s)$ and $dE^{**}/dp_i(s)$
are given in which small remainder terms are represented by
R* and R** respectively. The term R* represents
$\{Y(t-1) - E[Y(t-1)]\}\,db_t/dp_i(s)$, where

$$\frac{db_t}{dp_i(s)} = \frac{X_i(t-s-1)\,q_i(s-1) - X_i(t-s)\,p_i(s+1) - b_t X_i(t-s-1)\,[1-2p_i(s)]}{\mathrm{Var}[Y(t-1)]}.$$
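
The structure of the expression for $db_t/dp_i(s)$ can be understood through the quotient rule. The following is a sketch under two assumptions that are not restated in this appendix: that $b_t = \mathrm{Cov}[Y(t),Y(t-1)]/\mathrm{Var}[Y(t-1)]$ (which agrees with the values in Table III), and that the contribution of the cohort of age s to $\mathrm{Var}[Y(t-1)]$ is the binomial term $X_i(t-s-1)\,p_i(s)\,[1-p_i(s)]$.

```latex
\frac{d b_t}{d p_i(s)}
  = \frac{1}{\mathrm{Var}[Y(t-1)]}
    \left\{
      \frac{d\,\mathrm{Cov}[Y(t),Y(t-1)]}{d p_i(s)}
      - b_t\,\frac{d\,\mathrm{Var}[Y(t-1)]}{d p_i(s)}
    \right\},
\qquad
\frac{d\,\mathrm{Var}[Y(t-1)]}{d p_i(s)}
  = X_i(t-s-1)\,[\,1 - 2p_i(s)\,].
```

Under these assumptions the last term in the numerator of $db_t/dp_i(s)$ is $b_t$ times the derivative of the variance, and the first two terms play the role of the derivative of the covariance.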



The term R** represents the sum

$$\{Y(t-1)-E[Y(t-1)]\}\,\frac{db_t^1}{dp_i(s)} + \{Y(t-2)-E[Y(t-2)]\}\,\frac{db_t^2}{dp_i(s)}.$$
dbJ/dE.(s) = var[Y(t-l)i 

Var[Y(t-l)] ’ 



db 



^/dp^ ( s ) = var [Y('t-2)] [Y ( t ) ,,Y ( t-2 )]-b^b^_^Var [YCt^DlKD^-D^) 



Var[Y(t-2)] ^‘^1 ^2 *^3 



where 



Ap = var[Y(t-l) ] tX^(t-s-l)q^(s-l) - X^ ( t-s ) p^ ( s+1 ) ] 
A^ = b^Xj_(t-s-l) [l-2p^(s)] 



A^ = b^_^[X^(t-s-2)q^(s-2) - X^ ( t-s )p^ ( s+2 ) ] 
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= ‘^°^Var[Yj ' ^-2)] {Xj^(t-s-2)qj^(s-l) - Xj^(t-s-l)p^(s+l) } 

^ Cov[Y(t).Y(t-2)] Var[Y(t)] . v (t ?)n ?n ( 

H Var[Y(t-2)] Var [Y(t-2)] t-s-2) [l-2p^(s)] 



A. = 



Cov[Y(t) .Y(t-2)] , 

Var[^t-2)] b^_^X^(t-s) [l-2p^(s)] 



D. = 



Do = 



X^(t-s-2){2q^(s-l) + b^_^[2q^(s-l)] } 
X,(t-s-l) [2p,(s-H) + b^_^{2q^(s-D) 



*^1 Var [Y(t-2) ] ^^i(b-s-2)q^(s-2) - X^( t-s )p^(s+2) } 

=2 = °°''IIrlYjt- 2 )l*’ X^(t-s-2)ll-2Pi(s)l 



$$C_3 = b_t\,[X_i(t-s-2)\{q_i(s-1) - b_{t-1} + 2p_i(s)\} - X_i(t-s-1)\,p_i(s+1)],$$

$$C_4 = b_{t-1}\,[X_i(t-s-1)\{q_i(s-1) - b_t + 2p_i(s)\} - X_i(t-s)\,p_i(s+1)],$$

and

$$\Delta = 1/(1-\rho^2[Y(t-1),Y(t-2)]).$$



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